We present GAAF (a Generalised Automatic Anatomy Finder) for locating general anatomical structures in 3D CT scans. GAAF is an end-to-end pipeline with dedicated modules for data pre-processing, model training and inference. At its core, GAAF uses a custom convolutional neural network (CNN). The CNN model is small and lightweight, and can be adjusted to suit a particular application. So far, the GAAF framework has been tested in the head and neck, and is able to locate anatomical structures such as the centre of mass of the brainstem. GAAF was evaluated on an open-access dataset and achieved accurate and robust localisation performance. All of our code is open source and available at https://github.com/rrr-uom-projects/gaaf.
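The localisation target mentioned above, the centre of mass of a structure such as the brainstem, can be computed directly from a binary mask. A minimal sketch (not GAAF's own code; the mask here is synthetic):

```python
import numpy as np

def centre_of_mass(mask: np.ndarray) -> np.ndarray:
    """Voxel-space centre of mass of a binary structure mask."""
    coords = np.argwhere(mask)  # (N, 3) voxel indices of foreground
    return coords.mean(axis=0)

# Toy 3D mask with two foreground voxels
mask = np.zeros((5, 5, 5), dtype=bool)
mask[2, 2, 2] = True
mask[2, 2, 4] = True
print(centre_of_mass(mask))  # [2. 2. 3.]
```

A model like GAAF regresses this point directly from the CT image, but the mask-based version above defines the training target.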
Abdominal organ segmentation is a difficult and time-consuming task. To reduce the burden on clinical experts, fully-automated methods are highly desirable. Current approaches are dominated by convolutional neural networks (CNNs), but their computational requirements and need for large datasets limit their application in practice. By implementing a small and efficient custom 3D CNN, compiling the trained model and optimising the computational graph, our approach produces high-accuracy segmentations (Dice similarity coefficient (%): liver: 97.3 $\pm$ 1.3, kidneys: 94.8 $\pm$ 3.6, spleen: 96.4 $\pm$ 3.0, pancreas: 80.9 $\pm$ 10.1) in 1.6 seconds per image. Crucially, we are able to perform segmentation inference solely on CPU (no GPU required), thereby facilitating simple and widespread deployment of the model without specialist hardware.
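The Dice similarity coefficient used to report these results measures volumetric overlap between predicted and reference masks. A minimal sketch with synthetic masks:

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> float:
    """Dice similarity coefficient between two binary masks, in percent."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return 100.0 * 2.0 * intersection / (pred.sum() + target.sum() + eps)

# Two partially overlapping 3D masks
a = np.zeros((4, 4, 4), dtype=bool)
b = np.zeros((4, 4, 4), dtype=bool)
a[:2] = True   # 32 voxels
b[1:3] = True  # 32 voxels, 16 of them shared with a
print(round(dice_coefficient(a, b), 1))  # 2*16/(32+32) = 50.0
```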
Automatic segmentation of organs-at-risk (OARs) in CT scans using convolutional neural networks (CNNs) is being introduced into the radiotherapy workflow. However, these segmentations still require manual editing and approval by clinicians prior to clinical use, which can be time-consuming. The aim of this work was to develop a tool to automatically identify errors in 3D OAR segmentations without a ground truth. Our tool uses a novel architecture combining a CNN and graph neural network (GNN) to leverage the segmentation's appearance and shape. The proposed model was trained on a dataset of synthetically-generated parotid gland segmentations with realistic contouring errors. The effectiveness of our model was assessed with ablation tests, evaluating the efficacy of different parts of the architecture as well as the use of transfer learning from an unsupervised pretext task. Our best-performing model predicted errors on the parotid gland with a precision of 85.0% and 89.7% for internal and external errors respectively, and recall of 66.5% and 68.6%. This offline quality assurance tool could be used in the clinical pathway, with the potential to reduce the time clinicians spend correcting contours by detecting the regions which require their attention. All our code is publicly available at https://github.com/rrr-uom-projects/contour_auto_qatool.
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%), and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based; of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
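Patch-based training, the most common strategy reported above for handling images too large to process at once, amounts to cropping random fixed-size sub-volumes at training time. A minimal sketch (the shapes are illustrative):

```python
import numpy as np

def sample_patch(volume: np.ndarray, patch_size: tuple, rng: np.random.Generator) -> np.ndarray:
    """Randomly crop a fixed-size sub-volume, so the network never sees the full image."""
    starts = [rng.integers(0, s - p + 1) for s, p in zip(volume.shape, patch_size)]
    slices = tuple(slice(st, st + p) for st, p in zip(starts, patch_size))
    return volume[slices]

rng = np.random.default_rng(0)
ct = np.zeros((96, 96, 96), dtype=np.float32)  # stand-in for a full 3D scan
patch = sample_patch(ct, (32, 32, 32), rng)
print(patch.shape)  # (32, 32, 32)
```

At inference, predictions from overlapping patches are typically stitched back together to cover the full volume.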
Transformers are powerful visual learners, in large part due to their conspicuous lack of manually-specified priors. This flexibility can be problematic in tasks that involve multiple-view geometry, due to the near-infinite possible variations in 3D shapes and viewpoints (requiring flexibility), and the precise nature of projective geometry (obeying rigid laws). To resolve this conundrum, we propose a "light touch" approach, guiding visual Transformers to learn multiple-view geometry but allowing them to break free when needed. We achieve this by using epipolar lines to guide the Transformer's cross-attention maps, penalizing attention values outside the epipolar lines and encouraging higher attention along these lines since they contain geometrically plausible matches. Unlike previous methods, our proposal does not require any camera pose information at test-time. We focus on pose-invariant object instance retrieval, where standard Transformer networks struggle, due to the large differences in viewpoint between query and retrieved images. Experimentally, our method outperforms state-of-the-art approaches at object retrieval, without needing pose information at test-time.
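The epipolar guidance described above can be sketched as a soft penalty on cross-attention scores: keys far from a query's epipolar line are down-weighted before the softmax, rather than hard-masked. A toy numpy version (the penalty weight and line parameterisation are illustrative, not the paper's):

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def epipolar_guided_attention(logits: np.ndarray, key_xy: np.ndarray,
                              lines: np.ndarray, weight: float = 5.0) -> np.ndarray:
    """
    logits : (Q, K) raw cross-attention scores
    key_xy : (K, 2) 2D positions of key tokens in the second view
    lines  : (Q, 3) epipolar line (a, b, c) per query, with ax + by + c = 0
    Attention off the line is penalised in proportion to point-line distance.
    """
    a, b, c = lines[:, 0:1], lines[:, 1:2], lines[:, 2:3]       # (Q, 1) each
    x, y = key_xy[:, 0][None, :], key_xy[:, 1][None, :]         # (1, K) each
    dist = np.abs(a * x + b * y + c) / np.sqrt(a**2 + b**2)     # (Q, K)
    return softmax(logits - weight * dist, axis=-1)

# One query whose epipolar line is y = 0; two keys on the line, one off it
logits = np.zeros((1, 3))
key_xy = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]])
lines = np.array([[0.0, 1.0, 0.0]])  # 0*x + 1*y + 0 = 0
attn = epipolar_guided_attention(logits, key_xy, lines)
print(attn.round(3))  # attention mass concentrates on the two on-line keys
```

Because the penalty is soft, the network can still attend off the line when the evidence demands it, which is the "light touch" the abstract refers to.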
Artificial Intelligence (AI) is having a tremendous impact across most areas of science. Applications of AI in healthcare have the potential to improve our ability to detect, diagnose, prognose, and intervene on human disease. For AI models to be used clinically, they need to be made safe, reproducible and robust, and the underlying software framework must be aware of the particularities (e.g. geometry, physiology, physics) of medical data being processed. This work introduces MONAI, a freely available, community-supported, and consortium-led PyTorch-based framework for deep learning in healthcare. MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provides purpose-specific AI model architectures, transformations and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software development, providing an easy-to-use, robust, well-documented, and well-tested software framework. MONAI preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is being used by and receiving contributions from research, clinical and industrial teams from around the world, who are pursuing applications spanning nearly every aspect of healthcare.
In this work, we propose a novel perspective on the problem of patch correctness assessment: a correct patch implements changes that "answer" the question posed by the buggy behaviour. Concretely, we turn patch correctness assessment into a question-answering problem. To tackle it, our intuition is that natural language processing can provide the necessary representations and models for assessing the semantic relatedness between a bug (question) and a patch (answer). Specifically, we take as inputs the bug report as well as a natural language description of the generated patch. Our approach, Quatrain, first uses state-of-the-art commit message generation models to produce the relevant input associated with each generated patch. We then leverage a neural network architecture to learn the semantic relatedness between bug reports and commit messages. Experiments on a large dataset of 9,135 patches generated for three bug datasets (Defects4J, Bugs.jar and Bears) show that Quatrain can achieve an AUC of 0.886 in predicting patch correctness, and recalls 93% of correct patches while filtering out 62% of incorrect patches. Our experimental results further demonstrate the influence of input quality on prediction performance. We further perform experiments to highlight that the model indeed learns the relationship between bug reports and descriptions of code changes. Finally, we compare against prior work and discuss the benefits of our approach.
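The core signal Quatrain learns, relatedness between a bug report and a patch description, can be illustrated with a much cruder stand-in: bag-of-words cosine similarity. This is not the paper's neural model, only a sketch of the question-answering intuition (the example texts are invented):

```python
import math
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Bag-of-words cosine similarity as a crude stand-in for a learned relatedness score."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

bug_report = "null pointer exception when list is empty"
good_patch_msg = "add null check before iterating empty list"       # plausibly "answers" the bug
bad_patch_msg = "refactor logging configuration"                    # unrelated change
print(cosine_similarity(bug_report, good_patch_msg) >
      cosine_similarity(bug_report, bad_patch_msg))  # True
```

A learned model replaces this lexical overlap with semantic representations, but the decision it makes has the same shape: score the patch description against the bug report.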
Supervised learning tasks such as cancer survival prediction from gigapixel whole slide images (WSIs) are a critical challenge in computational pathology that require modelling complex features of the tumour microenvironment. These learning tasks are often solved with deep multi-instance learning (MIL) models that do not explicitly capture intratumoral heterogeneity. We develop a novel variance pooling architecture that enables a MIL model to incorporate intratumoral heterogeneity into its predictions. Two interpretability tools based on representative patches are illustrated to probe the biological signals captured by these models. An empirical study with 4,479 gigapixel WSIs from the Cancer Genome Atlas shows that adding variance pooling onto MIL frameworks improves survival prediction performance for five cancer types.
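The idea of variance pooling can be sketched in a few lines: instead of summarising a bag of patch embeddings by their mean alone, the slide-level representation also keeps per-dimension variance, so heterogeneous tumours produce different features than homogeneous ones. A toy version (shapes and the plain mean/variance formulation are illustrative, not the paper's exact architecture):

```python
import numpy as np

def variance_pooling(patch_embeddings: np.ndarray) -> np.ndarray:
    """Pool a bag of patch embeddings (N, D) into a (2*D,) vector of
    per-dimension mean and variance, retaining intratumoral heterogeneity."""
    mean = patch_embeddings.mean(axis=0)
    var = patch_embeddings.var(axis=0)
    return np.concatenate([mean, var])

rng = np.random.default_rng(0)
homogeneous = variance_pooling(np.ones((100, 8)))            # 100 identical patches
heterogeneous = variance_pooling(rng.normal(size=(100, 8)))  # 100 diverse patches
print(homogeneous[8:].max())  # variance features are exactly 0.0 for identical patches
```

Mean pooling alone would map both bags above to similar summaries; the variance half of the vector is what separates them.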
Representation learning of source code is essential for applying machine learning to software engineering tasks. Learning code representations across different programming languages has been shown to be more effective than learning from single-language datasets, since more training data from multilingual datasets improves the model's ability to extract language-agnostic information from source code. However, existing multilingual models focus only on learning shared parameters across languages and ignore language-specific information, which is crucial for downstream tasks when training on multilingual datasets. To address this problem, we propose MetaTPTrans, a meta-learning approach for multilingual code representation learning. MetaTPTrans generates different parameters for the feature extractor according to the specific programming language of the input source code snippet, enabling the model to learn both language-agnostic and language-specific information simultaneously. Experimental results show that MetaTPTrans significantly improves the F1 score of state-of-the-art approaches by up to 2.40 percentage points for code summarisation, a language-agnostic task; and improves the top-1 (top-5) prediction accuracy by up to 7.32 (13.15) percentage points for code completion, a language-specific task.
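The mechanism described above, generating feature-extractor parameters conditioned on the input's programming language, can be sketched as a tiny hypernetwork: a language embedding is mapped to the weights of a linear extractor. All names, shapes, and the linear form are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

class MetaParamGenerator:
    """Toy meta-learner: a per-language embedding is mapped to the weights of a
    linear feature extractor, so shared (meta) and language-specific knowledge coexist."""

    def __init__(self, n_langs: int, d_in: int, d_out: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.lang_emb = rng.normal(size=(n_langs, 16))            # language-specific
        self.meta_w = rng.normal(size=(16, d_in * d_out)) * 0.1   # shared across languages
        self.d_in, self.d_out = d_in, d_out

    def extract(self, lang_id: int, features: np.ndarray) -> np.ndarray:
        # Generate this language's extractor weights, then apply them
        w = (self.lang_emb[lang_id] @ self.meta_w).reshape(self.d_in, self.d_out)
        return features @ w

gen = MetaParamGenerator(n_langs=3, d_in=8, d_out=4)
tokens = np.ones((5, 8))                # stand-in for token features of a code snippet
out_py = gen.extract(0, tokens)         # extractor generated for "language 0"
out_java = gen.extract(1, tokens)       # a different language yields different parameters
print(out_py.shape, np.allclose(out_py, out_java))  # (5, 4) False
```

The shared `meta_w` is trained on all languages, while the per-language embedding injects language-specific behaviour, which mirrors the split the abstract describes.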
Deep generative models have emerged as promising tools for detecting arbitrary anomalies in data, dispensing with the need for manual labelling. Recently, autoregressive transformers have achieved state-of-the-art performance in medical imaging. However, these models still have some intrinsic weaknesses, such as requiring images to be modelled as 1D sequences, the accumulation of errors during the sampling process, and the significant inference times associated with transformers. Denoising diffusion probabilistic models are a class of non-autoregressive generative models recently shown to produce excellent samples in computer vision (surpassing generative adversarial networks) and to achieve log-likelihoods that are competitive with transformers while having fast inference times. Diffusion models can be applied to the latent representations learnt by autoencoders, making them easily scalable and excellent candidates for application to high-dimensional data such as medical images. Here, we propose a method based on diffusion models to detect and segment anomalies in brain imaging. By training the models on healthy data and then exploring their diffusion and reverse steps across the Markov chain, we can identify anomalous areas in the latent space and hence identify anomalies in the pixel space. Our diffusion models achieve competitive performance in a series of experiments with 2D CT and MRI data involving synthetic and real pathological lesions, with significantly reduced inference times, making their usage clinically viable.
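The final step of the pipeline above, turning a "healthified" reconstruction into a pixel-space anomaly segmentation, reduces to thresholding the residual between the input and its reconstruction. The sketch below assumes the reconstruction is already given (no actual diffusion model is run); the threshold and shapes are illustrative:

```python
import numpy as np

def anomaly_map(image: np.ndarray, healthy_reconstruction: np.ndarray,
                threshold: float = 0.5):
    """Pixel-wise residual between an input image and its model-produced 'healthy'
    reconstruction; large residuals mark candidate anomalies."""
    residual = np.abs(image - healthy_reconstruction)
    return residual, residual > threshold

healthy = np.zeros((8, 8))         # what the generative model would reconstruct
scan = healthy.copy()
scan[3:5, 3:5] = 1.0               # synthetic 2x2 lesion
residual, mask = anomaly_map(scan, healthy)
print(int(mask.sum()))  # 4 lesion pixels flagged
```

In the method described above, the reconstruction comes from running the diffusion and reverse steps on a model trained only on healthy data, so pathological regions are the ones the model fails to reproduce.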